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Abstract Strong lensing at the galaxy-scale is useful for nu- 
merous applications in Astrophysics and Cosmology. Some 
of the principal applications are studying the mass structure 
of elliptical galaxies, their formation and evolution, con- 
straining the stellar initial mass function, and measuring cos- 
mological parameters. Since the first discovery of a galaxy- 
scale strong lens in the eighties, this field has come a long 
way in terms of data quality and availability, and techniques 
to model the data. In this review article, we describe the 
most common methodologies to model lensing observables 
of galaxy-scale strong lenses, especially the imaging data 
due to it being the most available and informative source of 
lensing observables. We review the main results from the 
literature in astrophysical and cosmological applications of 
galaxy-scale strong lenses. We also discuss the current lim- 
itations of the data and methodologies, and provide a future 
outlook of the expected development and improvements in 
both aspects in the near future. 
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1 Introduction 


In this review article, we discuss applications of galaxy- 
scale strong lenses to study properties of the deflector galax- 
ies (i.e., the central lensing galaxies). These have so far typ- 
ically been massive elliptical galaxies at 0.1 < z < 1 and 
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strong lensing has been mostly applied to study their inter- 
nal structure and composition. However, a large sample of 
strong lenses with deflectors other than elliptical galaxies 
are expected to be discovered with the upcoming deep sky- 
surveys in this decade (see Saha et al. (in preparation)). We 
can also gain important insights into the formation and evo- 
lution of the deflectors by comparing their structural prop- 
erties (e.g., the logarithmic slope of the density profile, the 
dark matter fraction, etc) across cosmic times. Furthermore, 
in this review article, we present cosmological applications 
of the galaxy-scale lenses that do not require time delay in- 
formation —i.e., measuring cosmological parameters such as 
Q,,. Cosmological application involving the time delay mea- 
surements, i.e., measuring primarily the Hubble constant, is 
presented in details in Birrer et al. (in preparation). Addi- 
tionally, the application of galaxy-scale lensing to study sub- 
galactic structures of dark matter is presented in Vegetti et 
al. (in preparation). 

In this introductory section, we provide a brief descrip- 
tion of the lensing phenomenology (Sect. and discuss 
the advantages of strong lensing observables in comparison 
with other probes of galaxy mass, such as stellar dynamics 
(Sect. [I-2). The remainder of this review article is organized 
as follows. In Sect. |2| we highlight the major historical re- 
sults involving galaxy-scale lenses and introduce a number 
of prominent lens samples. In Sect.3] we describe the strong 
lensing observables, and their modelling and analysis meth- 
ods. We discuss the application of galaxy-scale strong lens- 
ing to study galaxy properties, formation, and evolution in 
Sect./4| and to constrain cosmological parameters in Sect.[5| 
Next, in Sect. |6| we discuss open issues — both in technical 
aspects and scientific questions — and provide some future 
outlooks. We conclude the review article in Sect.[7] 


1.1 Brief description of lensing phenomenology at the 
galaxy-scale 


In a galaxy-scale strong lens system, the background ex- 
tended source can be lensed into arcs or a full Einstein ring. 
If there is a point source within the background galaxy, e.g., 
an active galactic nucleus (AGN) or a supernova, then mul- 
tiple point-images would also appear. The former type of 
systems are called galaxy—galaxy lenses, whereas the latter 
are usually referred to as “quads” (for the case of 4 detected 
point images) or “doubles” (for the case of 2 detected point 
images). The different manifestations of strong lenses — ap- 
pearance of arcs or a full Einstein ring, or the number of 
point images — depend on the position of the source with re- 
spect to the lens caustics, as introduced in Saha et al. (in 
preparation). Strong lensing of point sources can provide 
three types of observables: image positions, image magnifi- 
cation ratios, and time delays between the images. It follows 
from the lensing theory that all three are properties of the 


Fermat potential, or the arrival-time surface: images form at 
the local extrema — minima, saddle points, and maxima — of 
this potential, magnification is inversely proportional to the 
determinant of its Jacobian matrix, and time delays are the 
differences in its height at the locations of the images (for a 
detailed explanation, see Saha et al. (in preparation)). 
However, not all of these observables are available for 
every lens. Image positions are almost always an observ- 
able for point-source lens systems. Image flux ratios, while 
always observable, can be turned to actual magnifications 
only when the intrinsic source flux is known. However, inter- 
preting these magnifications during modelling requires ex- 
tra care, as they can be affected by more complex features 
in the galaxy-scale mass distribution, e.g., baryonic disks 
(2017), microlensing by individual stars 
and planets in the lensing galaxy (see Vernardos et al. (in 
preparation)), some intermediate-mass-scale structures like 
dark matter subhalos (see Vegetti et al. (in preparation)), or 


dust extinction by the lens galaxy (e.g., 2002 
Mediavilla et al.}|2005). Time delays can be obtained only 


for variable point sources, like quasars, supernovae, and in 
the future, gravitational waves and fast radio bursts (see Bir- 
rer et al. (in preparation)). Measuring time delays can be 
a difficult task, especially for quasars, which require long- 
term monitoring spanning from a few seasons to years (e.g., 


Eigenbrod et al.| |2005} 2016} {Millon et al. 
2020). 


The above description of point-source lensing must be 
modified a little for extended sources that are comparable 
in size to the lens caustics. Instead of point-like image po- 
sitions, the light from such a source is spread into an ex- 
tended area in the image plane and then further smeared by 
the point spread function (PSF). As a result, it is hard to 
know a priori how the observed flux of the multiple images 
of the source on the lens plane is traced back to the same 
location on the source plane. This means that, even if the 
true source brightness is known, a lens model is required to 
compute the magnification field (i.e., the Jacobian of the lens 
potential) across the lens plane. Due to their size, extended 
sources are not variable on time scales that are relevant for 
lensing, thus time delays are not observable in the absence 
of a variable source. Note that for a lens system including a 
point-like source, extended arcs from the point-source host 
galaxy are also usually present, and thus the flux distribution 
of both the lensed arcs and the point-image positions can be 
utilized simultaneously to constrain a lens model. 


1.2 Unique advantages of lensing as a probe of galaxy 
structure 


Other than lensing, the only commonly used probe of the 
mass distribution in galaxies is kinematics, i.e., the veloc- 
ity dispersion or streaming motions of stars, gas, or globular 
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clusters. For this to be fully informative of the mass distri- 
bution, however, spatially resolved measurements are nec- 
essary (2016), which is often limited to nearby 
galaxies at z < 0.5 due to the large observational cost, or 
due to the apparent smaller size on sky of objects at higher 
redshifts. Using individual stars’ motions, e.g. from Gaia, to 
map a galaxy’s mass is only limited to our own Milky way 
(e.g., (2020), thus an aperture-integrated ve- 
locity dispersion measurement is the only possible dynami- 
cal observable for galaxies other than ours. However, strong 
lensing observables are obtained from high-resolution imag- 
ing data, which are much more informative than the dy- 
namical observables for galaxies beyond the local Universe 
(z 2 0.03, e.g.,{Smith et al.|[2015). Strong lensing can pro- 
vide ~1-—2% constraints on the mass enclosed within the 
Einstein radius from imaging data alone. To obtain a sim- 
ilarly precise mass constraint from dynamical observations 
at high redshift, either integration times longer by a factor 
O(10) are needed on current facilities, or we must wait for 
better quality adaptive optics systems planned for future ex- 
tremely large telescopes. 

Furthermore, the dynamical observables have their own 
intrinsic degeneracies, i.e., the mass—anisotropy degeneracy 
(e.g.,{Treu and Koopmans]|2002). There is an important com- 
plementarity between lensing and dynamical observables, 
which can be used to break their correspodning degenera- 
cies (e.g., (2014). Finally, gravitational de- 
flection of light depends only on the deflecting mass, so lens- 
ing responds to both baryonic as well as dark matter without 
any assumptions on their dynamical state, i.e. regardless of 
whether it is in equilibrium or not. 


2 Historical background 


In this section, we first provide a historical note on the initial 
discoveries of galaxy-scale strong lenses (Sect. [2.Tp. Then, 
in Sect. we briefly introduce a number of prominent 
samples of galaxy-scale lenses that have contributed to the 
major science applications described in Sect. [4]and Sect. [5] 
Additional references can be found in the review by 


(2010). 


2.1 Initial discoveries of strong lensing systems 


The first strong lens, the double quasar 0957+561A,B, was 
discovered in 1979 by {Walsh et al.|(1979). The system con- 
sists of two images of the quasar separated by 5”’7. The 
authors first offered a “conventional” interpretation that the 
two images are distinct, individual quasars that happen to be 
close to each other and share the same physical character- 
istics. Since no gravitational lens was known prior to that, 
their less conventional view was that the two are multiple 


images of the same source. Subsequent work showed that 
the lensing hypothesis was correct. For example, the struc- 
ture of the radio jets emanating from the two images of the 
quasar is consistent with them being mirror imaged, as one 
would expect for a minimum and a saddle image 


1988 1994). The discovery of 
the first quadruply imaged quasar, PG 1115+080, was an- 


nounced the following year 1980). It was 
initially called a “triple”, because the second arriving min- 
imum and its neighboring saddle point were too close to 
be resolved. The discovery of these two lenses opened up 
a brand new field in astrophysics: observations of multiply- 
imaged, “strong” gravitational lenses. Discoveries of other 
types of lensed sources followed: for example, the first dust 
obscured Seyfert 2 AGN, IRAS F10214+4724, was detected 
in the infra-red, and later identified as being lensed 
fhardt etal (1996) [Lehar and Broadhurst)/19%). 

The increasing number of detections of such lensed sys- 
tems spurred a lens modelling effort. The initial studies that 
used more detailed models beyond a point-like lens mass ap- 
peared in the early 1980's (Young et al.|/1980||198ibja), and 
already recognized “that there are several plausible ways to 
reproduce the observations”, foreshadowing the importance 
of lens model degeneracies. The role of the lens mass gran- 
ularity due to individual stars in the lensing galaxy was also 
recognized very early on 
[1981p, and later grew into the rich sub-field of extragalactic 
stellar microlensing (see Vernardos et al. (in preparation)). 


2.2 Prominent samples of galaxy-scale lenses 


In this subsection, we briefly introduce some of the most 
prominent samples of galaxy-scale lenses that had an impact 
on the science applications presented in Sect. /4]and Sect. 
Note that this is not a complete list of all the discovered 
galaxy-scale lenses. The specifics of lens searching and dis- 
covery to build samples like these are the topic of McMahon 
et al. (in preparation), where we refer the reader for recent 
developments in the search methods and discoveries. 


2.2.1 MG-VLA survey-based samples 


The first systematic search for strongly lensed systems at 
radio wavelengths took place in the eighties within the MG- 
VLA survey (Lawrence et al.||1986). This survey led to the 
discovery of a few famous radio-loud lensed quasars among 
thousands of radio sources scrutinized at high resolution with 
the Very Large Array (VLA). The Jodrell Bank-VLA As- 


tromtric Survey (JVAS; 1992 
1996), and its successor the Cosmic Lens All Sky Survey 


(CLASS; 1995) was the largest survey carried 


out for a long time. This survey targeted the whole north- 
ern sky (0° < Dec < 75°) for multiple images (separated by 


4 Shajib, Vernardos, Collett, Motta, Sluse, Williams, Saha, Spiniello, Birrer, Treu 


0’3 < A@ < 60) among flat spectrum radio sources brighter 
than 30 mJy. CLASS yielded the discovery of 22 new sys- 
tems, among which ten are doubles, nine are quads, and one 
displays 6 images (Browne et al.|/2003). 

The CfA-Arizona Space Telescope LEns Survey (CAS- 
TLES') is a follow-up Hubble Space Telescope (HST) imag- 
ing survey of ~100 galaxy-scale lenseq”]known at the time, 
some from previous surveys such as CLASS 
[1998). This survey collected the first uniform ensemble of 
high-resolution images of known galaxy-scale lens systems, 
including both galaxy—quasar and galaxy—galaxy lenses. An 
account of early systematic lens searches, which generally 
unveiled small samples of less than half a dozen systems, 


can be found in|Claeskens and Surdej| (2002). 


2.2.2 SDSS-based samples 


The Sloan Lens ACS (SLACS) survey discovered 85 galaxy— 
galaxy lenses from the Sloan Digital Sky Survey (SDSS) 
spectroscopic data and followed them up with multi-band 


HST imaging (Bolton et al.) |2006 2009). The 


sample was later expanded with 40 new systems having smaller [2018] {2020 2021 2021 
1.) {2021 


deflector masses — the SLACS for the Masses (S4TM) sam- 
ple (2015). In addition to galaxy—galaxy lenses, 
the SDSS Quasar Lens Search (SQLS) discovered a sample 
of 28 galaxy—quasar lenses using SDSS multicolor imaging 
data (Oguri et al.| 2006). More galaxy—quasar lens systems 
were discovered from joint SDSS and UKIRT Infrared Deep 
Sky Survey (UKIDSS) data by the Major UKIDSS-SDSS 


Cosmic Lens Survey (MUSCLES; 2012). 


2.2.3 CFHTLens-based samples 


The Strong Lensing Legacy Survey (SL2S) discovered a sam- 
ple of ~ 35 galaxy—galaxy lenses from the Canada-France- 

Hawaii Telescope Legacy Survey data (CFHTLens, |Gavazzi| 

jet al.| {2012). Some newer lens samples have also been dis- 

covered from the CFHTLens data (More et al.|/2016al |Parafidz 
etal 2016). 


2.2.4 BOSS-based samples 


The Baryon Oscillation Spectroscopic Survey Emission-Line 
Lens Survey (BELLS) discovered ~30 galaxy—galaxy sys- 


tems from the Baryon Oscillation Spectroscopic Survey (BOSS) 


and obtained HST imaging for them 2012). 


2.3 Other samples and ongoing efforts from recent surveys 


There are several other lens samples that have been recently 
discovered thanks to the numerous large-area sky surveys. 
The STRong-Lensing Insights into the Dark Energy Sur- 
vey (STRIDES) collaboration has discovered ~30 quadruply 
lensed quasar systems from the Dark Energy Survey (DES) 
data — often in combination with data from other large-area 
sky surveys — and obtained multi-band HST imaging of them 
in infra-red (IR), optical, and ultra-violet (UV) bands 
2029). A large number of 
galaxy—galaxy lens candidates have also been identified in 
the DES data 
(2022). In addition to the DES, 


surveys such as the Canada-France Imaging Survey (CFIS), 
Gaia, the Hyper Suprime-Cam (HSC) survey, the Kilo De- 
gree Survey (KiDS), and the Panoramic Survey Telescope 
and Rapid Response System (Pan-STARRS) have provided 
a plethora of newly discovered galaxy-scale strong lenses 


(e.g.,|Petrillo et al.}/2017||2019}|Agnello et al.}|2018} 
2018} [Delchambre et al.|/2018} 


[Lemon et al 2023} [Wong et al.][2029). 


of these samples still contain candidate lenses and require 
spectroscopic confirmation (by measuring the redshifts) and 
high-resolution imaging (to perform lens modelling) for the 
science applications described in Sect. [4] and Sect. |5| See 
McMahon et al. (in preparation)and the references therein 
for a detailed discussion on the ongoing efforts. 


3 Observables and analysis methods 


In this section, we describe the strong lensing observables 
(Sect. [3.Tp, and the analysis techniques to constrain galaxy 
properties from them. We describe the lens modelling meth- 
ods in Sect. and briefly discuss their common degen- 
eracies. We outline the Bayesian hierarchical framework for 

a sample of lens systems, which allows to infer population 
characteristics of galaxies from the sample in hand, in Sect.|3.4 
Lastly, additional non-lensing observables that are most com- 
monly combined with strong lensing observables are pre- 
sented in Sect. [3.5] 


3.1 Lensing observables 


This survey was later expanded into the BELLS for the GALaxy- Ty. two types of lensing observables for galaxy-scale lenses 


Lya EmitteR sYstems (BELLS GALLERY) survey, where 
the source galaxies are specifically Lya emitters 
2016). A sample of 13 strongly lensed quasars has also been 


discovered from the BOSS data 2016b). 
' https://lweb.cfa.harvard.edu/castles/ 


2? There are a handful cluster-scale lenses included in this sample as 
well. 


are imaging of the lensing system and the time delay be- 
tween different point-source images. 


3.1.1 Imaging of the lens system 


The most common and informative lensing observables re- 
sult from imaging data with angular resolution much better 
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SDSS J0737+3216 


SDSS J1205+4910 


ATLAS J0259-1635 


DES J0405-3308 iPTF l6geu 


Fig. 1 Examples of galaxy-scale lenses with different types of background sources. These images are false color images created from multi-band 
HST imaging. The first 2 panels show two SLACS systems, where the background source is a star-forming galaxy without a central AGN. Image 
credit: NASA/ESA, A. Bolton, and the SLACS team. The next 2 panels show quadruply lensed quasars discovered by the STRIDES collaboration 


(Shajib et al.|/2019). The fifth panel shows a quadruply imaged supernova (Dhawan et al.|!2020). For the quasar and supernova examples, lensed 


arcs or a complete Einstein ring from their extended host galaxy are also noticeable. 


than the Einstein radius. An extended source can be lensed 
into clearly identifiable arcs that can form partial or com- 
plete Einstein rings (see Fig. [ip. The conjugate points on 
the arcs, i.e. locations that are traced back to the same lo- 
cation in the source plane, can be used to simultaneously 
constrain a lens model (through its deflection angles and 
magnification) and reconstruct the surface brightness of the 
source that is a priori unknown. Although it is possible to 
constrain simple models from any kind of image that dis- 
plays lensing features with sufficient signal-to-noise ratio, 
high resolution imaging from space- or ground-based tele- 
scopes offers many more observational constraints (i.e., con- 
jugate pixels). This is crucial for exploring more sophisti- 
cated models, which are required for precise science appli- 
cations (see Sect.|4]and Sect. [5p. 

If the background source contains a point-like emitting 
region — e.g., a quasar or supernova — the positions of its 
multiple images can be extracted from the data and further 
used as constraints for a lens model (i.e., they are conjugate 
points). Although the simple addition of 2 or 4 conjugate 
pixels may initially seem insignificant compared to the hun- 
dreds of pixels that correspond to the extended source, the 
flux contained in these pixels may be brighter than the en- 
tire host galaxy, e.g., in the case of an AGN. Hence, these 
conjugate points have a much stronger effect on the result- 
ing model and need to be treated separately. In addition to 
astrometry of the point-like images, their magnification ra- 
tio (or flux ratio) can be used as an observational constraint. 
However, image magnifications are susceptible to micro- and 
milli-lensing, dust extinctior|’| and the effect of higher or- 
der moments in the mass distribution of the lens usually 
attributed to the complex, non-linear physics of baryons, 
e.g. galactic disks, bars [2016] 2017). There- 
fore, many lensing analyses involving systems with such 
point-like sources choose not to use magnification ratios as 


constraints (e.g., |Shajib et al.) |2019). However, if the mi- 


crolensing and dust extinction effects can be incorporated 


3 Lensed arcs from extended sources are susceptible to dust extinc- 
tion as well. 


and quantified within a lens model, then any residual flux 
ratio anomaly would signal a departure from the smooth 
macro-model for the deflector galaxy and thus could indi- 
cate the presence of dark subhalos within the main halo (e.g., 


Mao and Schneider} |1998| [Nierenberg et al.| {2017 
2020). Such detection of dark subhalos can provide 


important insights into the nature of the dark matter proper- 
ties, as presented in Vegetti et al. (in preparation). 


3.1.2 Time delay 


If the background source is a variable point source, e.g., a 
quasar or supernova, the delay between the arrival times of 
photons at its different multiple images can be measured 
through long-term monitoring that spans from a few sea- 


sons to decades (e.g.,|Eigenbrod et al.| |2005} {Bonvin et al.| 
[2016}/Millon et al.|{2020). The time delays are variant under 
the well-known mass-sheet degeneracy (MSD) in lensing, 
unlike the imaging observables 
(2014). Thus, they can be combined with the 


imaging observables to break the MSD when constraining 
the potential of the lensing galaxy. However, such a combi- 
nation of these observables requires assuming a fiducial cos- 
mological model, since the time delays depend on the cos- 
mology and in particular on the Hubble constant. A more 
detailed discussion on the measurement of time delays is 
provided in Birrer et al. (in preparation). 


3.2 Lens modelling methods 


Lens modelling is the process of constraining properties of 
the lens galaxy and the source from the lensing observables. 
Traditional methods are based on reconstructing the source 
light and lens potential to fit the data under some assump- 
tions such as regularization, and are described in Sect.|3.2.1 
More recently, machine learning methods have been devel- 
oped for these purposes, presented in Sect. [3-2-2] which pro- 
vide faster models with data-driven instead of ad-hoc as- 
sumptions. 
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3.2.1 Likelihood-based inference 


Such methods require a likelihood function to be maximized 
that leads to an optimized model, able to reproduce an ob- 
served multiply-imaged system down to the noise level. In 
general, the model consists of three main components: the 
background source’s flux distribution, the mass distribution 
in the lensing galaxy (or galaxies), and the flux distribution 
in the lensing galaxy (or galaxies). The likelihood function 
that measures the goodness-of-fit of the model to the data 
can be defined as 


1 
L(d | m) « exp 5 (d-m)’ Z;' (d—m)|, (1) 


where d is the vector of pixel values in the data, 


m= M(Emass> Esources light) (2) 


is the model’s estimate of the flux in the data pixels, where 
Emass 1S the set of model parameters defining the lens mass 
distribution, €source is the set of model parameters defining 
the source’s flux distribution, &jgn is the set of model pa- 
rameters defining the lens galaxy’s flux distribution, and 24 
is the covariance matrix of the data. Some studies choose to 
subtract the lens galaxy’s flux distribution from the imaging 
data before optimizing a lens model without it (e.g., [Bolton] 
(2008). Given that the lens flux is loosely related to 
the lensing phenomenon — only through mass-follows-light 
arguments that are not strictly required — we omit it in the 
following for brevity. This likelihood function can be ex- 
tended in a Bayesian framework to include prior (regular- 


ization) terms on the source 
BOOS) or used to con 
pute the Bayesian evidence 
and perform model comparisons. 


When the lensed arcs from an extended source are re- 
solved, each pixel serves as a constraint for the lens model. 
Given a deflection field, @(@), which depends on the lens 
mass distribution through &),,;;, the lens equation 


B(O) = 6—- a) (3) 


can be used to map any position on the image plane, @, to 
the corresponding one on the source plane, B. The lensed 
flux at any location on the image plane can then be easily 
calculated as 


1(0) = S{BO)], (4) 


where S is the light distribution of the source that depends 
On €,ource, and we use the fact that lensing conserves surface 
brightness. The dependence of /(@) on the mass is almost al- 
ways non-linear, which means that we cannot directly solve 
Eq. B]to obtain the true parameters &mass, even when S (and 
therefore Emass) 18 perfectly known. 


In practice, S is an unknown that needs to be solved for 
simultaneously with the lens potential and requires special 
attention. The simplest choice for it is an analytic function, 
e.g., a Sérsic profile [1968), whose parameters are 
treated in the same forward-modelling way as for the lens 
potential. However, a number of more advanced techniques 
have been developed over the years that allow a free-form 
reconstruction of the source, each with its own advantages 
and disadvantages. The semi-linear inversion technique of 
uses a regular grid of pixels to ap- 
proximate the source brightness. Although the degrees of 
freedom are much higher in this case, the use of priors ex- 
plicitly selected to correspond to quadratic terms in the like- 
lihood allows the direct computation of the best-fit source 
for given &mass (hence the name ‘inverse’ as opposed to ‘for- 
ward’ modelling). This actually turns the source reconstruc- 
tion into a linear inversion problem for a given set of non- 
linear parameters within {€mass, €source}, Where €source refers 
only to the ‘hyper-parameters’ of the source that are not 
solved through the linear inversion. 


This linear inversion problem requires some form of reg- 
ularization of the source (i.e., prior) to be solved. Typical 
choices of source regularization are gradient and curvature 
that impose smoothness on the source solution through its 
derivatives (a standard approach in this kind of problems, 


e.g. 1992). Alternative choices are covariance 
kernel priors (Vernardos and Koopmans} |2022), which are 


based on observations of the galaxy brightness power spec- 
trum and are thus a more physically justifiable choice, or 
multi-scale regularization through the use of sparsity con- 
straints on a wavelet representation of the source 
jet al.|[2021). In fact, choosing a different basis to represent 
the source can in itself significantly reduce the number of de- 
grees of freedom while still having enough flexibility to rep- 
resent complex light profiles across different scales 
see also Fig. B). This 
can be similarly achieved by reconstructing the source on an 
irregular, adaptive grid that can have increased resolution in 
the most magnified areas of the source, i.e., near the caustics 


(Vegetti and Koopmans} |2009} |Nightingale and Dye} |2015 
Vernardos and Koopmans}|2022). Fig. 2|illustrates an exam- 


ple of lens modelling based on high-resolution HST imaging 
of a lensing system with a bright, extended source compo- 
nent. The source is reconstructed on an adaptive grid using 


curvature regularization (see Chapter 4 of 2021). 


If there is a point source within the extended source galaxy, 
then the positions of its point-like images provide additional 
constraints on the mass model as conjugate points. One ap- 
proach to model the point-like source is to include the lo- 
cation of its multiple images as free parameters and then 
require the mass model to trace them back to the same loca- 
tion on the source plane through the lens equation. Inversely, 
the point-source location can be treated as free and the pre- 
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dicted image positions are required to match the observed 
ones. In practice, the choice between the two approaches is 
based on computational and sub-grid effect arguments (see 
for a more detailed presentation on the topic). 
Fig. |3| illustrates an example of lens modelling based on 
high-resolution HST imaging of a lensing system with both 
an extended and a point-like source component. In this case, 
the model for the extended source brightness (the quasar 
host galaxy) assumes an analytic Sérsic profile and an ad- 
ditional free-form component described by a basis set of 


shapelets 2015} |Shajib et al.|{2019). 
3.2.2 Machine-learning based parameter extraction 


Recently, machine learning (ML) based methods have been 
developed to extract lensing parameters — e.g., the Einstein 
radius, power-law index, ellipticity, and shear parameters — 


from the imaging data (e.g., 2017 
2019 2022||Schuldt et al-|[2022). 


In these approaches, a machine learning algorithm — usu- 
ally a neural network — is trained using synthetic data, since 
real examples of lens systems are not adequate in number 
for such data-intensive training. This leads to the important 
caveat that the mass density profile in the simulated galax- 


ies is based on empirical priors (e.g., 2017) 
or cosmological simulations (e.g., 2022). Thus, 


the inferred parameters are prior-dependent — either empiri- 
cal or physical — in a similar manner that the forward mod- 
elling approach depends on the adopted mass model and the 
associated priors. 

Although ML-based inferences are yet to demonstrate 
their potential on science applications with real data, they 
are essential for quickly extracting lensing parameters with 
minimum human supervision. This is crucial for modelling 
very large samples of lenses like those expected from several 


large surveys upcoming in the 2020’s (e.g., 2021 
Wagner-Carena et al.|/2021). For such large datasets, the tra- 


ditional forward modelling techniques would be unfeasible 
due to either computational or human time restrictions. Fi- 
nally, ML-based inference provides a direct and fast way 
to constrain quantities of interest that can be otherwise too 
cumbersome to infer from the data through a traditional ap- 
proach, e.g., detecting individual or populations of dark sub- 
halos, or constraining the subhalo-mass function 


note that these particular science questions are discussed in 
detail in Vegetti et al. (in preparation)). 


3.3 Lens mass models 
Lens mass models can vary in complexity and eventually 


in the number of free parameters, reflecting the fact that 
the mass distribution in galaxies is a non-trivial problem, 


for which methodologies and algorithms are still evolving. 
Although it is possible to extract lensing information with 
free-form models and directly connect them to galaxy prop- 
erties of interest (see Saha et al. (in preparation)), lens mod- 
elling with simple parametrization has been the most com- 


mon practice in the literature (e.g.,|Ritondale et al.| |2019 


Schmidt et al.|/2022). We briefly introduce some commonly 
used models with simple parametrization in Sect. dis- 


cuss some degeneracies that impact them in Sect. and 
describe free-form models in Sect. and model ensem- 
bles in Sect. 


3.3.1 Simply parametrized mass models 


Saha et al. (in preparation) provides an in-depth discussion 
about several popular mass models with simple parametriza- 
tion, e.g., the singular isothermal sphere/ellipsoid (SIS/SIE), 
the softened power-law mass distribution (SPEMD; 


1993 1998), and the Navarro— 
Frenk—White (NFW) profile 1996||1997). It 


is also often necessary to include a constant external shear 
(XS) lensing potential in addition to the primary lens mass 
distribution to accurately model the distortions in the lensed 
arcs or the Einstein ring arising from nearby perturbers and 
large-scale structures [1997). Until the mid- 
2010's, the SIE+ XS model has been the most popular choice 
to model large samples of galaxy-scale lenses (e.g., {Bolton| 
(2013), adequate for the re- 


quired science applications at the time. Although simple SIE 
models can sufficiently constrain the Einstein radius, ob- 
taining other important properties, such as the radial slope 
of the mass profile, requires models with additional degrees 
of freedom. Improved data quality and analysis techniques 
have allowed the use of such models (e.g.,|Ritondale et al. 
(2021). 

Simply parametrized lens models with larger degrees of 
freedom, e.g., a superposition of a stellar component with a 
constant/varying mass-to-light ratio and a NFW profile for 
the dark matter, have been adopted by some studies (e.g., 
[Trew etal |2010}|Sonnenfetd etal, [2015] Shajibetal/2021), 
Although necessary for the addressed science questions, such 
mass profiles with more free parameters amplify the impact 
of degeneracies inherent to lensing. It is thus often neces- 
sary to constrain these additional parameters by incorporat- 
ing non-strong-lensing observables such as stellar kinemat- 
ics and weak lensing, or by incorporating informative priors 
(eg.,.Sonnenfeld] 2018] Shajid et al][2021) 

While the additional degrees of freedom beyond the sim- 
ple SIE model focus on the radial profile of the lens in most 
cases, azimuthal structures such as discyness, boxyness, el- 
lipticity gradients, or isodensity twists may leave noticeable 
imprints in the lensed images. For point-like sources, the 
image flux ratios are the most susceptible to perturbations 
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Fig. 2. First panel: False-color RGB image combining three HST filters (Sonnenfeld et al.||2012). Second panel: HST image in the F814W filter 


of the system with the galaxy light subtracted. The lensed arcs from the source galaxies at different redshifts are indicated in solid and dashed 
contours. Third panel: Model of the lensed arc from only the brightest source. Fourth panel: Corresponding source reconstructed on an adaptive 


grid using every data pixel and curvarture regularization [see Chapter 4 of [Bayer| (2021) for the analysis of the data and|Vernardos and Koopmans 
(2022) for the modelling method]. 


DES J2038-4008 Reconstruction Reconstructed source 


by (arcsec) 


e@ 


6x (arcsec ) 


Fig. 3. Example of modelling a galaxy-scale lens system - here the lensed quasar DES J2038—4008. First panel: The false-color image of the 
system from the DES data, where only the point images are resolved (Agnello et al.|2018). Second panel: A lens model based only on the positions 
of the quasar images (Agnello et al.|[2018). The dashed lines show the contours of the time-delay surface, whose extrema define the locations where 
images of the point source appear. Third panel: The false-color image of the system taken by the Hubble Space Telescope (HST), where the lensed 
arcs of the quasar host galaxy can be seen in great detail. Fourth panel: The best-fit model of the HST data from|Shajib et al.](2019). Fifth panel: 
The corresponding reconstructed quasar host galaxy (without the central quasar) in the F160W band. The blue star marks the position of the central 
quasar. 


(Miller et al.| {2003 2003] |2005), while for = where A is a constant. This equation implies a source posi- 


extended sources the imprint is generally more subtle and _ tion transformation (e.g.,|Schneider and Sluse}|2014), where 
detectable only from high-resolution imaging data the unknown source position is altered as 


2022bja). Fortunately, most of those struc- 


tures arise from baryonic physics and may also be detectable BoB =A. (6) 
in the luminosity profile of the lens (Van de Vyvere et al. 
2022b). A more general but approximate degeneracy happens when 


A is not a constant anymore but depends on the position, i.e., 
a= 2(@) (Unruh et al] 2017} [Wertz et al.][2018). 

A particular choice of the mass profile, e.g., the power- 
law mass profile, artificially limits the MSD, as the MST of 
a power law is mathematically not a power law anymore. 
However, it is a common practice in the literature to as- 
sume such a mass profile in galaxy-scale lens modelling, 
since numerous non-lensing constraints on the mass profile 
of elliptical galaxies demonstrate that this is a good approx- 


imation (e.g. 2007} [Tortora etal [2074 
2018). The strong lensing imaging observables 


can only constrain in a model-independent way 6, and the 
MST-invariant quantity 


3.3.2 Common degeneracies in simple parametric 
modelling 


The degeneracies in lens modelling — both intrinsic in the 
data and stemming from the parametrization scheme — are 
discussed in detail in Saha et al. (in preparation). Here, we 
summarize the common degeneracies that largely impact sim- 
ply parametrized lens models for the convenience of the 
readers. 

The MSD, which is intrinsic to imaging observables in 
lensing, originates from the mass-sheet transform (MST, [Faldo 


Opa” (Og) 
pageee 7% (5) Std © 7 gy (7) 
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where a’”’ is the second derivative of the deflection angle 


[see Equation 42 of for the full definition of 
éaq, also [Kochanek| (2021)]. For a power-law convergence 
profile x(@) « @-7"-*!, this quantity becomes é,44 = yp — 2. 
As aresult, the choice of a SIE+XS lens model fixes &ag = 0 
and only extracts 6g from the imaging data. 

Degeneracies in lens modelling can also arise from par- 
ticular parametrizations of the lens model. One such ex- 
ample is the shear—ellipticity degeneracy, as the total shear 
can be redistributed between the “internal” shear, arising 
from the ellipticity of the central deflector, and the external 


shear (Kassiola and Kovner}\|1993). Specifically, a SIE+XS 


model can be modelled with only a SIE model that has a 


quadrupole moment equaling to 1/3rd of the shear 2005). 
This degeneracy is particularly apparent when modelling with 


point-like image positions as the only constraints 1996). 


3.3.3 Free-form models 


In the most commonly used lens models, the main focus is 
on capturing the radial shape of the mass profile, while any 
azimuthal structure beyond an elliptical shape with external 
shear is of secondary importance. Real galaxies can obvi- 
ously have more complicated mass profiles, with higher or- 
der moments present, such as discyness, boxyness or bar- 


or disc-like structures (e.g. 
tal 2006 seh etal, 201 7)Frigo et al, [2013) radia de 
pendence of the ellipticity or the orientation of the isoden- 
sity contours (gradients and twists, e.g., 


and Williams} |2018}|Nightingale et al.| 2019} |Shajib et al 
2021), or even features that do not fit into a simple para- 


metric description, such as merger products that are not yet 
completely relaxed and populations of substructures (e.g., 
satellite subhalos or perturbers along the line of sight). De- 
tecting such deviations from the simple parametric profiles 
depends on numerous factors, such as their alignment with 


the smooth potential (Van de Vyvere et al.|!2022b), the com- 
plexity of the brightness profile of the source 
2022), the signal-to-noise ratio, etc. How- 


ever, multipole components in the mass potential beyond the 
combined effect of ellipticity and external shear have been 
recently detected in very-long-base-line interferometric ob- 
servations of a strong lens (2022). Although 
not accounting for such structures can bias the mass model 
by up to several percent — which is mostly acceptable except 
for time-delay cosmography (see Birrer et al. (in prepara- 
tion)) and for detecting dark matter substructure (see Vegetti 
et al. (in preparation)) — their detection holds valuable infor- 
mation on the formation history and evolution of galaxies. 
To this extent, free-form techniques have been devel- 
oped that either entirely dismiss any parametric mass com- 
ponent and employ a grid of mass pixels to describe the 


lens potential (Saha and Williams} |1997), or retain a para- 


metric model as a first-order smooth component and com- 
bine it with a similar pixel grid that now focuses specifically 
on capturing higher order deviations (2005). In 
both cases, regularization (prior) assumptions on the free- 
form pixel grid are necessary in order, first, to even obtain a 
solution, and, second, to prevent the appearance of unphysi- 
cal mass distributions. Existing techniques are based on for- 
ward models or extensions of the semi-linear inversion tech- 
nique (Vernardos and Koopmans] 2022). 

The specific form of the regularization priors plays an 
important role on the quality of the obtained solutions. It 
has been shown that purely mathematically motivated pri- 
ors (e.g., curvature) can lead to biased potentials as opposed 
to more physically driven ones, based on the observed light 


properties of real galaxies (e.g. |Vernardos and Koopmans 
2022). Recently, (2022) have proposed a wavelet- 


based regularization technique that finds solutions that sat- 
isfy sparsity constraints, and Baggio et al. (in preparation) 
completely replaced the pixel grid by a neural network. These 
approaches allow the use of purely data-driven regulariza- 
tion that is the most compatible with the data (without com- 
puting and comparing the Bayesian evidence). It remains to 
be seen how well these new and promising techniques can 
perform on real data and robustly recover deviations from 
smooth, parametric models that encode galaxy evolution. 


3.3.4 Model ensembles 


Whereas the goal of most modelling done at present in the 
literature, including in free-form models as discussed above, 
is to find the single best fitting mass model with associ- 
ated statistical uncertainties, an alternative approach aims 
to find an ensemble of models, which explores the degener- 
ate “model-space” consistent with the data. In this approach, 
there is no single optimal model and no need for regulariza- 
tion. However, a prior is still necessary, but it is less infor- 
mative. 

Such models ensembles were first introduced for free- 


form models made up of mass tiles or pixels (Saha and Williams 
2004 2014). In these and related works, the 


mass distribution is required to be non-negative and cen- 
trally concentrated in a broad sense, and given these hard 
constraints, the prior probability is uniform in an abstract 
space of pixels. From this prior, models that reproduce the 
observed positions of point-like image features correctly are 
sampled in random-walk fashion to form an ensemble. The 
ensemble can be further filtered according to how well im- 
age data can be fitted. For an example, see Fig. [4] adopted 
from [Denzel et al.[202T). 

Free-form modeling is not, however, the only way to 
generate lens model ensembles; parametric models, with the 
number of parameters somewhat exceeding the number of 
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Fig. 4 An example of model ensemble for the lensing system SWOS5 
J1434+5228, showing the cumulative mass profiles from the lens 
model (purple) and from the MassiveFIRE simulation (Al, A4, Cl, 


D7;|Feldmann et al.||2016||2017). The solid lines show the total en- 


closed mass and the dashed lines show the enclosed stellar mass. The 
shaded regions encompass 99.7% confidence region. The models are 
best constrained at the image positions marked by vertical dashed lines. 


data constraints have also been used recently (Williams and 
2020} [Barrera et al|[2021). 


3.4 Bayesian hierarchical framework 


The Bayesian hierarchical framework can be used to con- 
strain the population properties from a sample of strong lenses 
(e.g., [Sonnenfeld et al] [2015). This framework also allows 
one to incorporate a selection function of the lensing galax- 
ies and generalize the sample properties to the population 
of all galaxies that are of the same type as the lensing ones 
(e.g., [Sonnenfeld et al.|[2019). Within the hierarchical anal- 
ysis, there are two levels of parameters: hyper-parameters 
that dictate the distribution of the parent population of the 
lens galaxies, and parameters pertaining to individual lens 
galaxies sampled from the parent population. The hierarchi- 
cal framework connects the “top’-level population hyper- 
parameters to the observed data of individual galaxies through 
the “bottom’’-level parameters. According to the Bayes’ the- 


galaxies, the above equation can be expressed as 


pw | D)« pw) | { dwip(D; | wi)p(wi |v), (9) 


where D; is the data from to the i-th individual lens galaxy. 
In this expression, the likelihood function p(D; | w;,) for 
each individual galaxy can be more easily computed than 
the top-level likelihood function p(D | v). This approach can 
be applied to infer any property at the population level with 
the associated mean and scatter values. See, for example, 
for a detailed presentation of 
the hierarchical framework with specific examples of hyper- 
parameters v. 


3.5 Incorporating non-strong-lensing observables 


Incorporating non-strong-lensing observables can be used to 
break the degeneracies in strong lensing analysis. For exam- 
ple, stellar kinematics data can be used to break the MSD 
(e.g., |Romanowsky and Kochanek\ |1999 


2002} |2004} |Shajib et al.||2018} |Birrer et al.| |2020 
Shajib et al.||2021), whereas spectroscopic stellar popula- 


my 


2021), and microlensing (e.g.,|Schechter et al.||2014}|Oguri 
et al.| (2014) information can help mitigate the degenerac 
between the stellar and dark matter distributions. 


< 


3.5.1 Combining stellar kinematics with strong lensing 


Imaging observables probe the mass distribution of the lens 
projected on the plane of the sky, whereas stellar kinemat- 
ics probe its full three-dimensional mass distribution. Thus, 
a combination of the two helps break the MSD to robustly 
constrain the mass distribution in galaxies. Although ellip- 
tical galaxies are triaxial, assuming spherical symmetry has 
been a standard practice for the case of a single aperture- 
integrated stellar velocity dispersion measurements (see[Son 
for a discussion on the impact of this as- 
sumption). Then, the stellar velocity dispersion is obtained 
by solving the spherical Jeans equation 


d (U(r) 2?) _ Bail) 0? 
r 


d® 
= —l(r) —. 
dr ol dr 


(10) 


orem, the posterior probability distribution of the hyper-parameteFfere, /(r) is the three-dimensional luminosity density of the 


v is given by 
p(v| D) x p(D | v)p), (8) 


where D is the dataset, p(D | v) is the likelihood, and p(v) is 
the prior. For model parameters w; pertaining to individual 


stars, 0, is the intrinsic radial velocity dispersion, and Bani(r) 
is the anisotropy parameter relating o, with the tangential 
velocity dispersion o, given by 


_;_% 
Bani() =l1- ey 


I 


(11) 
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By solving the Jeans equation, the line-of-sight velocity dis- 
persion, which is the kinematic observable, is obtained as 


dp 2G (Page (TV) MO) 
Tios®) = TRY J, (5) Pal 


(Mamon and Lokas}|2005). Here, M(r) is the enclosed mass 


within radius r. The function Kg(u) depends on the parame- 
terization of B(r) [see|Mamon and Lokas] (2005) for specific 
forms of Kg(u) corresponding to different 6(r)]. Thus, the 
observed velocity dispersion in Eq. [12] can be written as a 
function of the lens model parameters as 


(12) 


D 
ins = Da e JE mass: Sight, Bani A), 
s 


where €ass are the deflector’s mass model parameters, ight 
are the deflector’s light model parameters. In this form, the 
function J is independent of cosmology and it only depends 
on the lens model parameters {Emass, €light}, the anisotropy 
profile Bani(r), and the MST parameter A. All the cosmo- 
logical dependence of oj, is contained in the distance ra- 
tio Dx/ Das (Birrer et al.| 2016). To reproduce the observed 
velocity dispersion integrated within an aperture, the com- 
puted luminosity-weighted velocity dispersion needs to be 
blurred with the seeing S by performing the convolution 


(13) 
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where the * symbol denotes the convolution operation. For 
analytic solutions with specific choices for mass, light, and 
anisotropy profiles, see (2004) for the case of 
power-law mass and light profiles with constant anisotropy, 
and (2010) for the case with power-law mass 
profile, the Hernquist light profile [1990), and 


isotropic stellar orbits. 

The constraints from the velocity dispersion measure- 
ment can be folded in the lens model posterior with a multi- 
plicative likelihood term 


ap 
2 2 
Cate 


(5) 


(os - ormodel 2 
a 
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where C raps is the uncertainty in the observed velocity dis- 


persion ee Whereas the imaging observables from strong 


lensing cannot constrain the MST parameter A, the stellar 
velocity dispersion constrains A through the likelihood Lyin 
(see Birrer et al. (in preparation)for a detailed discussion 
within a hierarchical Bayesian framework). 

Although integrated velocity dispersion from single slit 
spectra is the most commonly used kinematic observable in 
strong lensing studies, a few studies have also incorporated 
spatially resolved velocity dispersion, mainly from IFU spec- 


tra (e.g.,|Barnabe and Koopmans}|2007 20111 
2012}|Spiniello et al.)/2015). 


3.5.2 Combining weak lensing with strong lensing 


Weak lensing measures the excess shear quantity 42’ from 
tidal distortions of galaxies that are far away (= 10 arcsec) 
from the central lensing galaxy. Thus, weak lensing provides 
information on the mass distribution of the lensing galaxy’s 
outer region, i.e. where the dark matter halo dominates. As 
a result, incorporating the weak lensing information allows 
constraining the normalization of a dark matter component 
in the lens mass model. Strong lensing observables provide 
information on the total mass distribution (e.g., the enclosed 
mass), but there remains a degeneracy between the luminous 
and dark matter fractions within the total enclosed mass. If 
a specific profile is assumed for the dark matter distribu- 
tion, weak lensing data can break this degeneracy between 
luminous and dark component normalizations 
Shajib et al.|[2021). 

Formally, the weak lensing information needs to be in- 
corporated within the hierarchical framework. Since the weak 
lensing signal from one single lensing galaxy does not usu- 
ally have enough constraining power on the dark matter nor- 
malization, it is often required to stack weak lensing signals 
from a large sample of elliptical galaxies. If it is justified to 
assume that this large sample of elliptical galaxies and the 
lens galaxy sample under consideration are subsamples of 
the same parent population, then the weak lensing observ- 
ables Dweak and the strong lensing observables Dytrong can 
be jointly considered in the posterior probability function of 
the population hyper-parameters v as 
piv | Dstrong: Dyweak) x P(Dstrong | V) P(Dweak | v)p(v). (16) 
Here, the strong lensing likelihood p(Dgtrong | v) can then be 
expanded using the single system likelihoods similar to the 
product term in Eq.[9} Since we have from Bayes’ theorem 
P(Dweak | v)P(v) & Pv | Deak), (17) 
it is also valid for numerical convenience to first obtain the 
posterior of v from weak lensing observables only, and then 
fold this posterior as the prior of v in the hierarchical analy- 
sis of strong lensing data only. 


4 Applications in galaxy properties and evolution 


In this section, we describe what we can learn about galaxy 
structure and evolution using the the lensing galaxy proper- 
ties. Since strong lensing galaxies are typically massive el- 
lipticals, most of the strong-lensing studies in the field relate 
to this type of galaxies (Sect. [4-i} Sect. 4.4). However, we 
briefly discuss strong lensing by spiral galaxies at the end of 
this section in Sect. [4.5] 
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4.1 Galaxy mass density profile 


All galaxies are believed to form and grow inside their own 
dark matter halos. Thus, the total mass density profile of a 
massive galaxy is composed of two components: the bary- 
onic matter distribution that includes stars and gas, and the 
dark matter distribution. As seen from the comparison of the 
observed galaxy luminosity function and the distribution of 
simulated dark-matter halos (i.e., abundance matching, see 
[Moster et al.|/2010), the stellar-mass fraction decreases with 
mass, for the mass range of typical lensing galaxies. This is 
attributed to AGN feedback and is evident from simple com- 
parisons of stellar and total mass in lensing galaxies 
etal 2009] Kiing et al, 2078). 

It is observed from several lensing and non-lensing ob- 
servables that the total density profile of massive galaxies is 
well approximated by a power law p(r) « r~”?' that is close 
to the isothermal profile, i.e., yp} = 2, e.g. from strong lens- 


ing only or in combination with stellar dynamics (Kochanek' 


1995} |Treu and Koopmans} |2004} |Dye and Warren| |2005 
Koopmans et al.| |2006} |2009} |Barnabe et al.| |2009} |Auger 
2 


.,/2010} |Dutton and Treu} |2014}|Ritondale et al. 
Powell et al.|/2022), from combination of strong and weak 
lensing (Gavazzi et al.) |2007), from stellar dynamics only 


(Bertin and Stiavelli| {1993} [Gerhard et al.| 2001} 


-|[2007} Tortora et al.|2014} [Bellstedt et al.|[2018}/Derkerne 


(2021), and from X-ray luminosity 
[Buote]/2010). This phenomenon - that the total mass profile 
in ellipticals approximately follows the power law, whereas 
neither the baryonic not the dark matter component individ- 
ually follows the power law — is referred to as the “bulge— 


halo conspiracy” (Treu and Koopmans||2004), similar to the 
“disk—-halo conspiracy” in spiral galaxies (van Albada and 
1986). Fig. [5]illustrates the power-law mass pro- 


file for seven SLACS lens galaxies constrained from lens- 
ing imaging data, and also the individual dark and luminous 
components that were constrained by combining kinematic 
and weak lensing information with strong lensing 
fet al.|/2021). In most cases, the total density profile is very 
close to a power-law form with deviations only being promi- 
nent far from the half-light, or effective, radius. From galaxy 
formation simulations, the close-to-isothermal nature of the 
total density profile are thought to originate from rearrang- 
ing the mass distribution through collisionless accretion in 


gas-poor mergers (Johansson et al.|[2009}/Remus et al.|/2013). 


Analysis of the SLACS lenses finds the mean logarith- 
mic slope (yp) = 2.08 + 0.03 with an intrinsic scatter of 
0.16+0.02 from a sample of 85 galaxy—galaxy lenses 
(2010). The median redshift 
of the SLACS lenses is (zs_acs) ~ 0.19. In this analysis, the 
SIE lens model was adopted to constraint 6g from the imag- 
ing data. Then, the power-law index yp) was obtained from 
the stellar velocity dispersion measured by the SDSS.|Shajib 


(2021) re-analyze 23 systems from the SLACS sample 
with a power-law model instead of the SIE model, and ob- 


tained the mean logarithmic slope (yp) = 2.08 + 0.03 with 
an intrinsic scatter of 0.13+0.02.|Ritondale et al.|(2019) ana- 
lyze 17 galaxy—galaxy lens systems from the BELLS sample 
with a power-law mass model to find the average logarithmic 
slope (Yp1) = 2.00 + 0.01. The mean redshift of the BELLS 
sample is approximately (zpetis) ~ 0.5. 

The surface mass in elliptical galaxies constrained from 
strong lensing can explain the origin of the so-called tilt of 
the fundamental plane, i.e., the tight correlation between the 
effective radius, the effective surface brightness, and the cen- 
tral velocity dispersion 1996). If the surface 
mass is used instead of the surface brightness, then the re- 
sulting mass plane (in place of the fundamental plane) is 
not tilted (2008). This result implies that the 
tilt of the fundamental plane stems from the increase in the 
dark matter fraction with increasing velocity dispersion or 
dynamical mass (Auger et al.||2009). 

In summary, the total mass profile in galaxies seems to 
be well described by a power law. The implications of this 
finding for the individual baryonic and dark matter compo- 
nents are discussed separately below. 


4.1.1 Luminous (baryonic) mass profile 


In recent years spatially resolved spectroscopic surveys of 
nearby ellipticals using integral field units (IFUs), and high 
resolution HST imaging have considerably advanced our un- 
derstanding of the structure and evolution of these galax- 
ies (Cappellari}/2016). Such detailed studies of intermediate 
redshift galaxies (z ~ 0.2 — 0.7) using direct observations 
are not possible. However, the mass of galaxies acting as 
strong lenses at these redshifts can be mapped out in detail, 
thereby providing critical information about the progenitors 
of present day galaxies and hence galaxy evolution. 

In the last decade, imaging and kinematic data has led 
to a revision of galaxy classification. The modern analyses 
classify ellipticals into fast and slow rotators, and those with 
and without central cores, resulting in a total of 4 classes 


(Cappellari} |2016 2020). Massive ellip- 


ticals (M = 10!''Mo) in the local Universe tend to have 
small cores with size 0.02—0.50 kpc (Krajnovié et al.|/2020), 
i.e., much smaller than the Einstein radius of typical lenses, 
while less massive ones appear to have cuspy central light 
profiles. Cores are believed to be the result of in-spiraling 
super-massive black hole (SMBH) binaries, which transfer 
their angular momentum outward to stars and leave a flatter 
density core at the center. The population of massive ellip- 
tical galaxies is also more likely to be composed of slow 
rotators, being also morphologicaly rounder than the lower 


mass fast rotator counter-part (e.g., 2014 
van de Sande et al.}/2017). 
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Fig. 5 3D mass distribution in dark matter and baryonic matter in lensing elliptical galaxies constrained by 2021). The power law 
mass distribution (orange) refers to imaging-data-only-based constraint that used a power-law lens model. (2021) combined strong 
lensing imaging with the stellar kinematics and weak lensing information to decouple baryonic (blue) and dark matter (grey) contributions to the 
total mass density profile (purple). In most cases, the total density profile is close to the power-law profile with occasional deviations appearing 


near the center or far outside of the effective radius Rep. 


Although it is well known that the baryons and the dark 
matter do not follow the same radial profile, it is not known 
a priori whether they have the same angular structure or not. 
Strong lensing allows to compare the azimuthal distributions 
of the light and the total mass distributions, and thus to de- 
tect any possible difference between the angular structures 
of the dark matter and the baryons. A number of studies re- 
port strong correlation in the ellipticity between the matter 


and light distributions (e.g.,/Koopmans}|2006}{Gavazzi et al.| 
2012|[Sluse etal. 2012|[Kostrzewa-Rutkowska et al,/2014) 
whereas some other studies only report weak or no correla- 
tion (ex. 
2031). Some 


of the differences can be attributed to data quality, mod- 


elling procedure, or selection effects (2021). 
The major axes are usually well aligned (< 10°) between 
the mass and light distributions (e.g., 
Bruderer et al.}/2016}{Shajib et al.|{2019}/Shayib 


et al.||2021). The cases where the major axes of the mass 
and light do not align within ~ 10° also generally have 
large external shear. These findings suggest that stellar or- 
bits highly misaligned with the potential can only be sus- 
tained in non-isolated galaxies (indicated by the large exter- 
nal shear), which is consistent with what is found in sim- 


ulations (Heiligman and Schwarzschild|{1979 
7988)[Adams etal.|[2007 2015). 


Although the stellar IMF pertains to the luminous structure, 
we present the strong lensing results on the stellar IMF sep- 


arately in Sect. [4.3] 
4.1.2 Dark matter profile 


Numerical simulations show that the dark matter halos and 
the baryonic mass within them initially follow the Navarro— 
Frenk—White profile before star 
formation begins. However, the baryonic gas has to cool 
down and fall inward for star formation to begin. The con- 
traction in the baryonic matter deepens the gravitational po- 
tential, thus the dark matter distribution also contracts adi- 
abatically in response (Blumenthal et al.| 1986). In the pro- 
cess of adiabatic contraction, the initial radius r; of a dark 
matter particle and its final radius r¢ is related as 


rMi(r) = r¢eMe(72), (18) 


where M(r) is the enclosed 3D mass within radius r (Blu-| 
|menthal et al.|[1986). However, numerical simulations find 
that dark matter does not fully respond to the baryonic in- 
fall according to the theoretical model of 


(1986) (e.g...{Gnedin et al.|[2004] [Abadi et al,]|2010).|Dutton| 
(2007) prescribe a formalism defining a halo response 


parameter v to adjust the degree of contraction (i.e., the re- 
sponse to the baryonic infall) as 


ri =I’ (re) re, (19) 
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Fig. 6 The halo response parameter v constrained by |Shajib et al. 


is consistent with no adiabatic contraction in the dark matter 
halo from the initial NFW density profile. The blue shaded distribution 
corresponds to a model with spatially constant mass-to-light ratio, and 
the red shaded distribution corresponds to a baryonic distribution with 
a mass-to-light ratio gradient. 


where I'(r¢) = r¢/r; is the contraction factor. In this for- 
malism, vy = O corresponds to no contraction and v = 1 
corresponds to fully responsive contraction according to the 


model of|Blumenthal et al.|(1986). The simulations of|Gnedin 
let al.|(2004) point to v ~ 0.8, and those of|Abadi et al.|(2010) 
to v ~ 0.4 (Dutton and Treul|2014}. 


Several studies have combined additional information, 
e.g., stellar kinematics and weak lensing, with the strong 
lensing data to decouple the dark matter distribution and 
the baryonic distribution from the total density profile that 
strong lensing can constrain. adopt 
multiple dark matter contraction models with fixed v to val- 
ues between -0.5 and 1 and find that v = 0, i.e. no contrac- 
tion, provides the best match to the data from the SLACS 
sample among all the adopted models. 
allow for a fully variable vy parameter within -0.5 and 1 in 
their dark matter model, and find the average contraction 
in their sample to be (v) = —0.0310-. which is consis- 
tent with no contraction and rules out the contraction results 
from simulations with high statistical significance (Fig. [6). 


4.2 Galaxy evolution 


Observations of elliptical galaxy structure at different red- 
shifts — using strong lensing constraints from HST imaging, 
or IFU kinematics, or both — have provided important in- 
sights into formation pathways of elliptical galaxies. How- 


ever, observations of distant galaxies at different redshifts 
are by nature snapshots of different galaxies at different times. 
To form a statement on the assembly history of these galax- 
ies, it is necessary to first identify evolutionary trends (or 
the lack thereof) in galaxy properties. To interpret the pres- 
ence (or absence) of such evolutionary trends in terms of 
astrophysical processes, numerical simulations have been 
an essential tool. Cosmological and hydrodynamical sim- 
ulations adopt different prescriptions for gravitational and 
baryonic interactions, and by comparing the properties of 
the simulated galaxies with those of the observed ones, we 
can identify the appropriate interactions or processes at play 
to assemble the elliptical galaxies under consideration. From 
cosmological N-body simulations, it has been evident that 
the elliptical galaxies embedded in their dark matter halos 
are the end results of hierarchical mergers of smaller dark 
matter halos (e.g. 
[1996). As a result, the structure of the baryonic and dark 
matter density profiles observed in these galaxies carry in- 
formation on their assembly history. 


Here, we state some of the key observations from strong 
lensing studies that shed light on the formation and evolu- 
tion of massive ellipticals. The observed “bulge—halo con- 
spiracy” described in Sect. [4.1] requires a particular mix- 
ture of adiabatic contraction and feedback-driven expansion 
along the galaxies’ formation history. The near isothermal- 
ity of the total mass distribution can be explained as the 
end result of collisionless accretion of matter in gas-poor 
mergers, where after achieving isothermality further merg- 


ers preserve this feature (e.g., 
fon et al [2005] Remus et al.|[2013)|Wang et al][2019). As 
a result, gas-poor mergers appear to be the primary forma- 
tion mechanism for massive ellipticals 
(2009). However, dissipational gas-rich mergers cannot be 
completely ruled out due to the observed existence of con- 
tracted central dark matter profiles (Sonnenfeld et al.|/2014). 
Furthermore, simulations predict that the dark matter distri- 
bution would adiabatically contract due to the baryonic in- 
fall through cooling at the onset of star formation (z ~ 2) 


(Blumenthal et al.}|/1986 1994 
2004 2004) |Abadi et al.) /2010). However, 


strong lensing studies find the dark matter halos consistent 
with the “un-contracted” NFW profile at z ~ 0.2 
(2021). To reconcile the pre- 
diction from simulations for the dark halo shape at z ~ 2 
with the observed one at z ~ 0.2, the presence of bary- 
onic feedback mechanisms is necessary, which can expand 
the dark halos by transferring baryons outward. Although 
the relative importances between the stellar, supernova, and 
AGN feedbacks in the evolutionary track of ellipticals have 
not been conclusive yet, comparing structural properties of 
simulated and observed ellipticals suggests that AGN feed- 
back has been non-negligible in addition to stellar and su- 
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pernova feedbacks 2017}|Shajib et al.|/2021). 


This scenario is consistent with the findings from the IIlus- 
tris TNG hydrodynamic simulation, where the density profile 
gets steeper than isothermality until z ~ 2 during the ini- 
tial baryonic infall, then primarily AGN feedback makes the 
profile close to isothermal again until z ~ 1, after which the 
elliptical galaxies passively evolve primarily through gas- 
poor mergers that preserve their near-isothermal nature with 


little to no change in the logarithmic slope 
(2020). 

We can summarize the above discussion on formation 
history of ellipticals as follows (see also[Koopmans et al.| 
(2006). The initial mass density profile of the dark matter 
and baryons in halos follow the NFW profile. The baryons 
fall inward through cooling and trigger star formation, and 
as a result of the baryonic infall the dark matter halos con- 
tract up to z ~ 2. Then, several feedback mechanisms — i.e., 
stellar, supernova, and AGN — counteract the initial adia- 
batic contraction by transferring baryons outward and thus 
expanding the dark halo shape back to the NFW profile. 
The dark matter halo and the embedded galaxy grow in size 
through a few major mergers and multiple minor mergers, 
most of which are gas-poor and thus dissipationless 
(2012). These mergers lead to 
the galaxy being pressure-supported and tri-axial in shape, 
thus forming ellipticals out of disky or spiral galaxies. Dissi- 
pationless mergers lead to isothermality in the total density 
profile, and such mergers conserve the isothermality after it 
has been achieved. 

Nevertheless, there remains a number of open issues and 
potential tensions in investigating galaxy evolution through 
comparing simulations with the observed ones using strong 
lensing. We discuss these open issues in Sect.[6.5] 


4.3 Stellar initial mass function 


When combined with ancillary data, strong lensing provides 
a method to infer the stellar mass-to-light ratio (M,./L) 
2010), which can then be related to the low-mass end 


of the stellar initial mass function (IMF) slope in the lens- 
ing galaxies (for a review, see [Smith] {2020). This is mainly 
because low-mass stars (My, < 0.5Mo) contribute only by 
a few percent to the integrated light in the optical but they 


give a much larger contribution to the mass (Conroy and van| 
[Dokkum 2012}. 


As described in Sect. the total mass distribution in 
the lensing galaxy can be decomposed into the stellar and 
dark contribution by combining the lensing and dynamical 
observables, and thus the total stellar mass ML can be ob- 
tained. An independent method to get the stellar mass M$?S 
of the galaxy is the stellar population synthesis (SPS) method 
applied on the lens galaxy’s photometric or spectroscopic 


data (Spiniello et al.||2011||2012). The M,/Z computed via 


SPS analysis depends on the choice of the IMF slope. In 
particular, a bottom-heavier IMF implies a larger M,./L, be- 
cause dwarf-stars contributed more to the mass than to the 
light. Hence, the mismatch parameter: ayyp = ML? /MSPS. 
can be used to infer the lightness or heaviness of the IMF 


2010). For example, if a light IMF — such as 
the Chabrier IMF (Chabrier||2003) — is adopted in the SPS 


method, then a@zyr ~ 1 would point to the Chabrier IMF be- 
ing consistent with the lensing and dynamical observables. 
However, @tyr ~ 2 would indicate that the IMF in the lens 
galaxies is bottom-heavier (i.e characterised by a larger num- 
ber of dwarf stars), with a slope more similar to that of a 
Salpeter IMF [1955), or even steeper. Note that 
M* may depend on a number of lens modelling assump- 
tions, such as the choice of dark matter density profile, the 
stellar mass-to-light ratio being spatially constant or vary- 
ing, and the assumed anisotropy profile of the stellar orbits 
in the dynamical modelling (e.g., see the discussions in|Son-| 


nenfeld et al.|/2018}/2019). 


One of the greatest advantages of lensing, when com- 
bined with dynamics and stellar population analysis, is that 
it allows not only the inference of the IMF slope but also 
to put constraints on the low-mass cutoff (Miww). 
has shown this by studying two strong lenses 


from the X-Shooter Lens Survey (Spiniello et al.) |2011) for 
which both HST imaging (for precise lens modelling) and 


X-Shooter spectra (for stellar population analysis) were avail- 
able. Chromatic, microlensing-induced flux anomalies in a 
galaxy—quasar strong lens can also be used to constrain the 
stellar IMF (2014). This particular tech- 
nique is described in Vernardos et al. (in preparation), which 
discusses the theory and applications of microlensing. 


Whereas the IMF within the Milky Way is light — ie., 
consistent with the Chabrier IMF regardless of the stellar 


population age and environment 
— the majority of strong lensing studies on el- 
liptical galaxies report consistency with a heavier IMF slope 
(ee Spinielo etal ]2O1T] Sonnenfetd et al, 2012) [OTdham 
(2018). For instance, the SLACS analysis — by 


combining lensing and dynamics — finds the IMF in ellipti- 
cal galaxies at mean redshift (z) ~ 0.2 to be consistent with 


the Salpeter IMF, i.e., amr ~ 2 (see Fig.[7| 
also, [Grillo et al.[2009). This result is reproduced with more 
flexible models for the same SLACS systems or a subset of 
them (Auger et al.|[2010}/Shajib et al.|/2021). This is also in 
agreement with the more general findings, based on dynam- 
ics or SPS analysis only, that the IMF is bottom-heavier for 


more massive galaxies in general (Cappellari et al. 2012}|Lal 
2013}|Spiniello et al.||2014).The general pic- 


ture is that the low-mass end of the IMF slope might not be 
universal across all galaxies, as generally assumed in the last 
30 years. However a consensus on the physical mechanisms 
that could be responsible for its variation has not yet been 
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reached. According to theoretical work (e.g. 
[Chabrier et al./[2014), high density, temperature and turbu- 
lence of the gas are key parameters that drive the fragmen- 
tation of molecular clouds (higher density and temperature 
make the fragmentation easier, forming more dwarf stars, 
i.e. a bottom-heavier IMF). 


In contrast with the above mentioned studies on mas- 
sive, local elliptical galaxies, {Sonnenfeld et al.|(2019) find 
log jg @imr = —0.04 + 0.11 with respect to the Chabrier IMF 
in the SPS method for a sample of strong lensing galaxies 
at z ~ 0.6, which is in tension with [Shetty and Cappellari| 
that reports consistency with a Salpeter IMF for a 
galaxy at z ~ | from dynamical analysis. However, allow- 
ing spatial gradients in the M/L can alleviate this tension 
(2019). Indeed, a radial gradient 
in the M,/L, or equivalently a radially varying IMF, has 
been reported by studies based on the SPS method applied 


on local massive ellipticals (Martin-Navarro et al.| |2015 
2017 2018 
2021a\b). These authors find that the central region within 


~ 2 kpc has a heavy IMF (even super-Salpeter), and the IMF 
in the outer regions gradually becomes light (i.e., Milky- 
Way-like). The current belief is therefore that the IMF is 
bottom-heavy for stars formed very early on in cosmic time, 
via a quick and violent SF burst. These stars usually form 
what is often called a “red nugget” (z ~ 2, 
(2011): an ultra-compact red and dead massive core. Then, 
with a second and more time-extended phase, red nuggets 
merge, interact with other structures in the Universe, and ac- 
crete gas. This causes a growth in size up to a factor of ~ 5, 
and (only slightly) in mass, and transforms them into gi- 
ant local, massive ellipticals. Depending on the merger his- 
tory of each single galaxy, the red-nugget can remain (al- 
most) untouched in the innermost region, hence dominating 
the light there. In this case one would measure a bottom- 
heavy IMF in this region. This is what seems to be the case 
for NGC 3311, the central galaxy in the Hydra cluster 
[bosa etal 202Ta), M87 and many other 
very massive low-z early-type galaxies. However, the red 
nugget can also be destroyed or contaminated by accreted 
or lately formed stars. In that case, one would measure a 
Milky-Way like IMF, which is the one characteristic for stars 
formed later on and through more time-extended star forma- 
tion channels. This “two-phase formation scenario” (Naab| 


2009) is also supported by the discovery of “relic 
galaxies" (Trujillo et al.) |2014} |Spiniello et al.) 2021), the 


local counterparts of red nuggets that somehow completely 
missed the size-growth and evolved passively and undis- 
turbed across cosmic time. The IMF slope for these peculiar 
and rare objects has been measured to be steep everywhere 
up to at least one effective radius (Ferré-Mateu et al.|/2017). 
Finally, simulations have lately shown that not all elliptical 
galaxies formed via this two-phase formation scenario (e.g, 


2021), although this becomes more and more 


common with increasing stellar mass. 

The finding of the Salpeter IMF from combination of 
lensing and dynamics in the near-by Universe are consistent 
with this scenario, as strong lensing information is sensi- 
tive to the inner region of the galaxy (typically < 6 kpc). 
However, at higher redshift, lensing probes larger and larger 
region, which explain the results presented in 
(2019). A radially decreasing M,/L also explains or 
alleviates the reported tension between lensing-based stud- 


ies themselves (Sonnenfeld et al.||2018}|2019}|Shajib et al. 
2021). 


Furthermore, studies of few local massive lenses for which 
the Einstein radius is much smaller than the effective radius, 
and hence where the stellar-mass dominates the lensing in- 
ference, indicate that bottom-heavy IMFs are excluded by 
lensing 
(2016). The only way that has been proposed so far to allevi- 
ate this disagreement is to allow for a variable cut-off in the 
low mass end of the IMF (Newman et al.|[2017). 

In conclusion the currently preferred scenario sees the 
majority of massive galaxies having a bottom-heavy IMF 
in their innermost region, where the pristine stellar popula- 
tion formed at z > 2 through a star formation burst domi- 
nate, while stars in outskirt are distributed by a Milky-Way 
like IMF. However, depending on the single galaxy detailed 
merger tree and cosmic evolution, the IMF can differ from 
system to system. 

In the future, [FU-based stellar kinematics and popula- 
tion analysis of strong lensing galaxies in combination with 
lensing constraints can shed light on the presence/absence 
of IMF variations and spatial gradients in ellipticals at inter- 
mediate redshifts. However, to properly track any evolution 
in the IMF properties of elliptical galaxies across redshift, 
it would be essential to mitigate systematic impacts through 
uniform choice of models and to account for selection dif- 
ferences between samples. 


4.4 Constraints on the very central densities and 
super-massive black holes from central images 


Gravitational lensing theory predicts that the number of mul- 
tiple images must always be odd. In systems with 3 (or 5) 
images, 2 (or 4) are formed roughly at the Einstein radius 
from the lens center, which is < 2” for galaxies. The odd 
3rd (or 5th) image is formed very near the center of the lens. 
It is always demagnified, usually significantly, and superim- 
posed on the light of the lensing galaxy, making it hard to 
detect. Most existing searches for central images of strongly 
lensed quasars (given that they are much brighter sources) 
rely on optical, or radio wavelengths. The radio wavelengths 
are most favourable as the lensing galaxy is generally trans- 
parent in that range, but these investigations are limited by 
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Fig. 7 Summary of measurements on the stellar IMF from lensing and 


dynamical observations (Smith et al.|/2015). 


the fact that quasars are generally radio quiet. Out of ~ 200 
doubles discovered to date, only 2 have observed central im- 
ages where the lens is a single galaxy] PMN J1632—0033, 
demagnified to 0.004, or by 6 mag compared to the brightest 


image 2004), and PKS 1830-211, demagnified 
to 0.007, or by 5.4 mag (Muller et al.||2020). Of the ~ 50 


known quads with a single lensing galaxy, no reliable detec- 
tion exists. Upper limits have also been measured. For exam- 
ple, {Quinn et al.|(2016) place an upper limit of ~ 10~* on the 
magnification of the central image with respect to the bright- 
est visible image in the double B1030+074. Upper limits on 
the flux of the central image have also been placed at X-ray 
wavelengths for several quads and doubles. The stacking of 
X-ray monitoring data allows to achieve effective exposures 
of several hundreds of kilo-seconds without any contamina- 
tion from the lens, however the data are still too shallow to 
strongly constrain the lens galaxy inner density profile. One 
of the deepest upper limits has been achieved for HE 0435- 


1123 (e.g.,{Chen et al,] 2012 2017). 


Detection of central images, or the lack thereof, has been 
used to place constraints on the central mass density of the 
lensing galaxies and the mass of the central SMBH 
ture prospects for detecting supermassive binary black holes 
due to their lensing effects are discussed in{Li et al.| (2012); 


Hezaveh et al.|(2015). 


4.5 Spiral galaxies 


Due to their lower mass and presence of a disc, spiral galax- 
ies have a substantially lower lensing cross-section than el- 
liptical ones (Keeton and Kochanek] |1998). Only a handful 
of spiral galaxies lensing quasars have been discovered to 
date. The most well-known and the first example of such 
a system is the Einstein cross, which is in fact lensed by 
the bulge of a nearby spiral galaxy [1985). 
Targeted searches for galaxies lensed by a spiral have been 
carried out, allowing the discovery of several dozens of sys- 
tems and reference therein). Complica- 
tions in the study of those systems arise from the dust and 
the disc mass component. Correcting for the reddening by 
dust is needed to model extended lensed images, which rely 
on the conservation of the surface brightness between the 
lens and the source plane. On the other hand, the disc com- 
ponent yields strong discontinuities in the gravitational po- 
tential and needs to be explicitly modelled using, e.g., an 
exponential mass density profile. Whereas disc-like features 


can yield flux anomalies in lensed quasars (Hsueh et al. 
2016} |2017 2017), they can be disentangled 
from the dust using multi-band data (Moller et al.| |2003). 


Once these ingredients are accounted for, the combination 
of kinematic and lensing information can be used to break 
the degeneracy — that exists in dynamical studies alone — 


between the disc and halo components (Maller et al.||2000 
[Dutton etal 2011 2072}. 


5 Applications in cosmology 


In this section, we present applications of galaxy-scale lens 
systems to measure cosmological parameters without us- 
ing the time-delay information. Measurement of the Hub- 
ble constant (Ho) and other cosmological parameters based 
on the time delays are discussed in detail in Birrer et al. (in 
preparation). 

Here, we briefly establish some necessary definitions for 
use in this section. More detailed explanations of the cosmo- 
logical connection with strong lensing formalism is given in 
Saha et al. (in preparation). The angular diameter distance 
between two objects at redshifts z; and zz (with z; < z2) for 
a flat Universd)]is given by 


Cc Z2, dz’ 
Dan (Z1, 22) = { > 
Ho(1 +22) Jz, E(Z’) 


where E(z) = H(z)/Hpo is the dimensionless Friedman equa- 
tion given by 


(20) 


E(z) = Von +z)? + O11 + 24 + Qae(1 + 23+ Wee), (21) 


4 There are at least 5 cases of central images where the lens has two 
or more main lensing galaxies. 


> See Saha et al. (in preparation), Section 2.1.1 for the general defi- 
nition in the non-flat case. 
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On 


Fig. 8 Example extraction of cosmological parameters from the dou- 
ble source-plane lens system SDSS J0946+ 1006 (shown in Fig.22). The 
68, 95, and 99.7% confidence intervals on the 2,,—w plane from this 


double-source plane system are shown in red and are from|Collett et al. 
(2012), while the black contours are the constraints from Planck 2013. 


Here, Qn, 2;, and Qa. represent the matter, radiation and 
dark energy density parameters, respectively, at z = 0. The 
parameter We is the equation-of-state parameter of the dark 
energy given by Wage = Pae/Pacc?, where page and Pac de- 
note the pressure and density of the dark energy, respec- 
tively. In the ACDM model, wae = —1 is assumed, i.e., the 
dark energy density stays constant through cosmic time. The 
wCDM model is one natural extension of the ACDM model, 
where wae # —1 is allowed, however, wae < —1/3 should 
still be satisfied to reproduce an accelerated Universe. 

In the next subsections, we discuss methods to estimate 
cosmological parameters (primarily, Qn, Que, and Wae) US- 
ing multiple-source-plane lenses (Sect.|5.1), using the stellar 
kinematics of strong lenses (Sect. [5.2p, and using galaxy— 
galaxy lensing statistics (Sect.[5.3). fgdsg 


5.1 Utilizing multiple sources at different redshifts 


Strong lens systems with multiple sources (i-e., compound 
lenses) at different redshifts can be used as cosmographic 
probes. Currently, only a small sample of galaxy-scale com- 


pound lenses are known (e.g., 
jet al.|/2008} [Collett and Smith] |2020), but at the HST snap- 
shot depth they are expected to occur in about 1% of galaxy- 
scale strong lenses (2008). Indeed, if one 
was to stare deeply at any single plane lens, further sources 
would almost inevitably be discovered, as is spectacularly 
demonstrated by the discovery of a third source (a z = 6 
Lyman-a emitter) behind the Jackpot lens in deep Multi Unit 
Spectroscopic Explorer (MUSE) data from the Very Large 


Telescope (Collett and Smith} |2020). Forthcoming surveys 


are expected to discover O(1000) compound lenses. 

Since the Einstein radius is a function of the lens mass 
and the cosmological distances, the ratio of Einstein radii in 
a compound lens with sources at two or more redshifts is 
independent of the mass 
(2012). In practice, the method also requires a complete un- 
derstanding of the lens density profile and of additional lens- 
ing by the source galaxies themselves and other perturbing 
masses along the line of sight. 

For a two-source-plane system with one primary lens, 
the lens equation can be written as 


y =x — B)2@4(x), 


(22) 
Z =X — Q(X) — A (K — Bir@q(x)), 


where x are positions on the image plane, y and z are the un- 
lensed positions of the first and second source, respectively, 
@q is the deflection caused by the primary lens, a; is the 
deflection caused by the closer of the two sources, and fj; is 
the cosmological scaling factor 


Bij = os 
D,Dis 


(23) 


For realistic redshifts and cosmologies, /;; is sensitive to the 
matter fraction and the equation-of-state parameter Wqe, but 
has no dependence on the Hubble constant Ho. 

Amongst galaxy-scale compound lenses, only the Jack- 
pot lens (2008), shown in Fig. [2| has been 
used to precisely constrain cosmology, since this system has 
a favourable redshift configuration — other compound lenses 
have the multiple sources that are at similar redshifts, thus 
having 8 ~ | regardless of the cosmology. [Collett and Auger| 
model the HST imaging, performing a pixellated re- 
construction of both sources, and make a 1.1% measurement 
on B-!. Converting this into constraints on the dark energy, 
this single compound lens with a CMB prior from [Planck] 
constrains Wg. to 0.2 precision, 
as shown in Fig. [3] With hundreds of compound lenses ex- 
pected in the Vera Rubin Observatory Legacy Survey of Space 
and Time (LSST) and the Euclid surveys, constraints on 
both wae and its redshift derivative are expected to be com- 
petitive with established cosmological probes. 


5.2 Utilizing stellar kinematics of single source lenses 


The enclosed projected mass inside the Einstein radius is in- 
dependent of the mass profile (Schneider et al.||1992). Com- 
bining this with the finding that the density profile of ellip- 
tical galaxies is very close to isothermal (see Sect. [4.T]and 
references therein), it is sufficient to model lens galaxies as 
singular isothermal ellipsoids (SIE) when only a constraint 
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on the Einstein radius is needed. For a singular isothermal lenses. Therefore, it was expected that lensing rates should 


sphere (SIS), we have be suppressed for larger values of 2,. Further information 
2D is contained in the Einstein radius distribution and the lens 

Og = 4n(8) = 5 (24) and source redshift distributions. In practice this topic has 
Ds fallen out of fashion due to the cosmological sensitivity be- 

where Gj, is the velocity dispersion of the deflector. Us- ing overwhelmed by astrophysical uncertainties of the un- 
ing this equation, one can constrain the distance ratio d°*> = _ lensed source population, the lens discovery selection func- 


Dgs/Ds using the observed Einstein radius 6; and assum- _ tion, and the lensing properties of typical galaxies (e. g..|Mitchell 
ing that the observed velocity dispersion oo; = Osis. Thus, 2005 2010). 


a sample of single-source lenses with measured oj, can 


be used to estimate cosmological parameters (Grillo et al. 
2008) by maximizing the likelihood function 6 Open problems and future outlook 


1 Xs [a (Za, Zs; O) — dg, ores) In this section, we discuss the currently open problems that 
log £(@) = - 5) >, 5d) » (25) are expected to be tackled in this decade: the selection func- 

el ea? tion (Sect. (6.1), the self-similarity assumption (Sect. (6.2), 
where @ is the set of free parameters in the assumed cosmo- _— degeneracies in strong-lensing (Sect. |6.3) and non-strong- 
logical model, Nsz is the number of lens systems, and dd’ _ lensing observables (Sect.|6.4), comparison with galaxy sim- 
is the uncertainty of each d°®’ that depends on the ojo; and ulations (Sect. [6.5) and statistical methods to perform such 
Og uncertainties. comparisons (Sect.[6.6). We also provide future outlooks on 

Similar to the multiple-source lens systems, this method __ these issues whenever appropriate. 

is also independent of Ho. However, the assumptions in this 
method should be carefully scrutinized: (i) the measured Og 
is independent of the choice of the lens mass profile (e.g., 


2015), and (ii) the measured line-of-sight veloc- 


ity dispersion is equal to that for a SIS profile. In the early 


years, |Ofek et al.|(2003) estimated that systematics might be properties, the lensing phenomenon is a rare occurrence re- 


up to 20% in 6g, however more recent modelling methods quiring a serendipitous alignment of two line-of-sight ob- 
provide robust 6g measurement within a few per cents re- jects separated by a large cosmological distance. Thus, sam- 


ecidibas- Gt thetiads pronlechaiee 021 ) [Trev et al] les of strong lensing galaxies inherently occupy a tiny frac- 
{2006} argue that oo, = sis for those lenses that are ellipti- tion of the population of all galaxies. When strong lensing 


cal galaxies with velocity dispersions in the range 200-300 SILC int No Ine? propenues OF the. poner Populaion et 
km s~!. These results are further confirmed by the analysis galaxies based on such small samples of galaxies, the se- 


of other samples (Bolton et al.| 2006 F010). lection function for the lens sample must be taken into ac- 
Someunthansdlearand Metaloote 5000) Count. Is strong lensing selection inherently biased towards 


find that certain sub-samples provide nonphysical values (1.e., ane ane iene ele it eae Secaee an ie oii 
d* > 1), providing cosmological parameters that are in- to easier discovery? Does the particular search algorit 


consistent with other observations (such as SNIa, CMB). (.e., based on spectroscopic lines, or based on image mor- 


(2020) suggest that those nonphysical values phology) performed over a survey area limit the selection 
are related to other kind of uncertainties in the lens system to a particular sub-population of galaxies? These questions 


such as complex lens substructure, uncertain redshifts, etc. need to be taken into account before generalizing properties 


When these systems are excluded, the estimated cosmologi- inferred in samples of strong lenses to the entire population 
cal parameters are consistent with those expected from other of galaxies. Methodologically, selection functions have been 


observations (Magaria et al.|[2014 2014|(Aghal imincorporated by some studies in either of the following two 


et all 2018] [Motta et al D021). ways: (i) a parametrized selection function in Bayesian hi- 
erarchical analysis (e.g., Sonnenfeld et al. |2019), and (ii) 


posterior predictive tests based on mock samples applying 


5.3 Utilizing galaxy—galaxy lensing statistics the same selection procedure as the actual lens sample (e.g., 
Sonnenfeld et al.}/2018). 


6.1 Selection function 


Although strong lensing is a powerful probe to study galaxy 


The statistics of strong lensing were initially expected to be 
powerful for probing the cosmological parameters (Fukugita] 6.1.1 Parametrized selection function 
1990). At the most basic level, matter clusters can col- 


lapse to the densities required to form strong lenses, whereas _‘ The selection function affects the prior probability p(w; | v) 
the cosmological constant does not cluster and cannot form __ of the observed lens model parameters wj for the i-th lens in 
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a sample, given population hyper-parameters v, through the 
selection criteria applied on a set of lens and source galaxy 
properties u; as 


P(Wi | VY) = Pintr(Wi | Vv) Paet(Wi, ui | V), (26) 


where Pintr(w; | v) is the intrinsic distribution of the lens 
model parameters w; for a randomly selected galaxy, and 
Paet iS the probability of detection. Here, the specific prop- 
erties in the set u; — that Paet is sensitive to — depends on the 
particular search algorithm. For example, most search algo- 
rithms discover brighter systems more easily and hence the 
stellar mass M, of the lens and source galaxies, which is 
a proxy of the total flux (magnitude) of the system, can be 
included in u;. The parameter sets w; and u; can have com- 
mon members depending on the particular parametrization 
of the lens model. The reader is referred to|Sonnenfeld et al. 


(2015}|2019) for specific examples of parametrized selection 
functions. 


6.1.2 Posterior predictive test 


If the selection function is too complicated to express in a 
functional form, a posterior predictive test can be performed 
to evaluate the appropriateness of the adopted model in re- 
trieving accurate hyper-parameters. In this test, the same se- 
lection criteria is applied to the parent sample of elliptical 
galaxies to create multiple realizations of mock strong lens- 
ing samples. Then, the posteriors p(v | D) are compared be- 
tween the real sample and the mock samples. If the posteri- 
ors are in agreement, then the appropriateness of the adopted 
model in recovering the correct hyper-level parameters is 
validated. The reader is referred to 
for a specific example of posterior predictive test on a lens 
sample. 

Large lens samples curated from multiple search algo- 
rithms (e.g., the sample used in|Shajib et al.|(2019) may not 
allow for a simple parametrization of the selection function. 
In such cases, the posterior predictive test can be applied. 
However, that comes with an increased computational bur- 
den in the analysis. In the future, lenses discovered from a 
single survey data — e.g., from the Vera Rubin, Euclid, and 
Roman observatories — through a uniform searching algo- 
rithm can provide a large sample with sufficiently high sta- 
tistical constraining power. However, the necessity to prop- 
erly consider a selection function cannot be circumvented. 


6.2 Assumption of self-similarity 


When hierarchically inferring population properties of lenses 
by combining multiple strong lensing systems (Sect.|3.4), it 
is crucial to describe and quantify potential differences be- 
tween subsets of the considered sample for an accurate pop- 
ulation level inference. Differences in the population may 


arise from the sample selection due to search criteria and 
techniques (Sect. |6.1), or due to intrinsic differences in the 
sources. Such differences in secondary selection might im- 
pact and bias population level constraints when assuming 
that two different samples can be described with an identi- 
cal underlying population. For example, (2020) 
present a hierarchical analysis under the assumptions that 
the SLACS and the Time-Delay COSMOgraphy (TDCOSMO) 
lenses are described by the same population level parame- 
ters, in order to better constrain the mass density profiles of 
the time-delay lenses. There are different ways to mitigate 
such assumptions. The first, is to select a purified sample 
of lenses as self-similar as possible when performing hi- 
erarchical analyses. However, although the currently small 
number of known lenses will increase by future surveys and 
enable this approach, it is still unclear in which parame- 
ter space these lenses can be considered self-similar and 
whether there are important latent variables to consider. The 
second way is to model and describe differences between 
populations on a first-principle level, folding in differential 
selection effects and other aspects into the analysis (as stated 
in Sect.|6. I). 


6.3 Degeneracies in strong-lensing observables 


The ability of the lensing data to constrain the radial struc- 
ture of the lens, and in particular to disentangle dark and 
luminous mass components, is strongly dependent on the 
data quality. As discussed in Saha et al. (in preparation), the 
modelling of point-source astrometry mostly encodes infor- 
mation on the quadrupole moment of the lens, but provides 
limited constraints on the monopole (the total mass). This, 
however, does not mean that any monopole model combined 
with a quadrupole component will accurately reproduce a 


set of lensed image positions. For some choice of the monopole, 


extreme (unphysical) values of the quadrupole may be re- 
quired, naturally excluding some mathematically correct so- 
lutions. This explains why, e.g., a single component con- 
stant mass-to-light ratio model generally yields large shear 
amplitudes to reproduce the observed astrometry of lensed 
systems to high accuracy (e.g., (2012). On the 
other hand, while extended lensed images may provide de- 
tailed azimuthal information, and allow one to constrain the 
ratio of radial magnifications at different galacto-centric dis- 
tances, their effective constraint on the density profile re- 
mains sensitive to the MSD (e. g.,{Sonnenfeld| |2018). While 
the MSD provides a mathematically large leverage to mod- 
ify the results, its impact may remain in practice generally 
small, with a typical change on the total density profile that 
may not exceed a few tens percent. The chosen prior on 
the mass profile (i.e., choice of families of mass distribu- 
tion or free-form with some regularization) may further limit 
the impact of degeneracies. One may however need to be 
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careful when choosing a mass distribution as a too rigid 
model compared to the true mass density may yield too bi- 
ased or underestimated parameter uncertainties (e.g., 
(2021). Finally, it is important to 
note that the presence of point-images with measured time- 
delays limits the impact of degeneracies in lensing-only ob- 


servables for an assumed cosmology (Saha and Williams 
2001 2002). 


The constraint on the power-law slope of galaxies de- 
scribed in Sect. [4.1] results from the combination of lensing 
data and kinematics data. The lensing data either consist of 
an estimate of the Einstein radius (e.g. [Bolton et al.| [2006), 
or is based on models of extended images. By definition, 
the power-law index requires to assume a power-law density 
model for both lensing and kinematics data. For that reason, 
it is insensitive to the mass-sheet degeneracy. The compar- 
ison of power-law indices of lensing galaxies as a function 
of redshift requires, however, to be handled with care. In- 
deed, as the lens and source redshifts vary, lensed images 
are formed at different galacto-centric radii, such that small 
deviations from the power-law model may mimic an (ab- 
sence of) evolution of the galaxy total density profile with 


redshift. To account for this effect, (Dutton and Treu}|2014) 


suggest the use of a mass weighted slope. 


6.4 Degeneracies in non-strong-lensing observables 


The modelling of stellar kinematics data requires an assump- 
tion on the anisotropy profile of stellar orbits. Integrated ve- 
locity dispersions obtained from single-slit spectra cannot 
constrain the anisotropy profile, which leads to the so-called 


mass—anisotropy degeneracy (Treu and Koopmans} |2002). 


Typically adopted anisotropy profiles are either isotropic or 
the Osipkov—Merritt profile (Osipkov| {1979} |Merritt}/T985a[b). 
Whereas the isotropy assumption does not entail any free pa- 
rameter, the Osipkov—Merritt profile depends on a scale ra- 
dius. Due to the mass—anisotropy degeneracy, the posterior 
of the anisotropy scale radius is dominated by the adopted 
prior (e.g., (2018). Particular choices of the 
anisotropy profile and the associated prior may lead to sys- 
tematic differences between studies involving a combination 
of strong lensing and kinematics data. For example, 
find the existence of a mass-to-light 


ratio gradient in the SLACS lenses assuming isotropic or- 
bits, whereas (2021) find consistency with a 
constant mass-to-light ratio assuming the Osipkov—Merritt 
anisotropy profile for a subsample of SLACS. 

Spatially resolved velocity dispersion measurements can 
better constrain the anisotropy profile by breaking the mass— 
anisotropy degeneracy. However, spatially resolved kinemat- 
ics data from integral field unit (IFU) spectroscopy are more 
expensive to obtain than an integrated measurement from 
single-slit spectroscopy, thus usage of such data in lensing 


studies has been limited (see, e.g., 2010 
2011) for example of IFU data being com- 


bined with strong lensing). 


6.5 Comparison with galaxy simulations 


The past decade has seen significant progress in our under- 
standing of the structure and formation of galaxies. Within 
the ACDM paradigm, there is general agreement regarding 
the gravitational part of galaxy formation. The ‘gastrophysics’ 
is less well understood and requires subgrid models to simu- 
late, but still the galaxies formed in simulations like Illustris 


( Vogelsberger et al.|/2014), FIRE (Hopkins et al.|/2014), and 
EAGLE 2015) are much more credible than 


previous generations of simulated galaxies. There are also 
equilibrium galaxy models including stars, gas, and dark 
matter, of which the AGAMA simulations 
are arguably the most sophisticated. 

Despite the great advances in the fidelity of the simula- 
tions, there has been a number of discrepancies between the 
simulated predictions and the observed properties of galax- 
ies. In particular, simulations have not been successful yet 
in reproducing the observed distributions of the logarith- 
mic slope yp and the dark matter fraction fam simultane- 
ously. A no or weak feedback prescription was required in 
some of the simulations to reproduce the y,) distribution, 
which, howeve, led to underestimating fam compared to the 
observations due to overestimating the star formation effi- 


ciency (e.g., 2007} [Duffy et al.| {2010} 
2009). Similarly, matching the fam distribution re- 


quired strong feedback prescriptions, but these produce too 
shallow yp; compared to the observed distribution. More re- 
cently, the IllustrisTNG simulation was able to reproduce a 
Sam—Yp1 distribution that is consistent with the strong lensing 
observations, if the stellar IMF corresponds to the Salpeter 


IMF (Wang et al.| {2020} |Shajib et al.||2021). In contrast, 
Mukherjee et al.) (2022), find that the EAGLE and SLACS 


Sam—Yp! distributions agree while using a Chabrier IMF, sup- 
porting the important role played by feedback and sub-grid 
physics in reproducing this relation. 

Furthermore, there has been an apparent tension in the 
redshift evolution of the logarithmic slope yp) between ob- 
servations and simulations. Strong lensing observations re- 
port a steepening of yp) with decreasing redshift at z < 1 
(Ruff etal. 2OTT} Sonnenfeld eta.) 2013). Such a steepen- 
ing would require dissipative processes through wet mergers 
along the evolutionary track of elliptical galaxies. Simula- 
tions instead find no evidence for redshift evolution of yp] 
below z = 1, ora slightly shallowing trend in yp, with de- 


creasing redshift (see Fig. [9] [Xu et al.| 2017} 
2017 2020). This tension vanishes if the same 


strong lensing analysis is applied to the simulated galaxies 
from Ilustris and Magneticum to extract yp| by combining 
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lensing and kinematic information 2017 
2017). Therefore, the modelling systematics in the 


joint analysis of lensing and dynamical observables cannot 
be ruled out as the source of the above tension. 

The specific importance of various baryonic processes 
in the evolution of ellipticals is still an open question, which 
would require further fine-tuning in the simulations to con- 
sistently reproduce all of their observable properties. How- 
ever, modelling systematics and the selection function need 
to be properly taken into account for accurate comparison 
between simulations and observations. The first results of 
the statistical framework outlined in the next subsections 
for inference of the galaxy evolution scenario when jointly 
considering simulations and observations indicate that AGN 


feedback is essential (Denzel et al.|!2021a) but application 


to a large sample of lenses is required. 


6.6 Statistical framework for comparison with simulations 


To circumvent the potential systematics associated with lens 
modelling, an exciting prospect would be comparing the sim- 
ulated galaxies directly with lensing observations. In gen- 
eral terms, the idea is straightforward. Let F stand for some 
formation scenario of galaxies. This could be a particular 
feedback prescription in hydrodynamic simulation, or some 
family of galaxy models in phase space. Scenario F' pro- 
duces simulated galaxies g with a distribution P(g| fF’). Let 
G denote the data on an observed galaxy. The quantity 
P(G|F) = Xi, P(G|g) P(g|F) (27) 
is the likelihood of G in scenario F’. The posterior probabil- 
ity of the formation scenario given the observed galaxy is 


_ PG|F)P) 
PF |G) = PG) (28) 
and the product of this quantity 
P(F |{G}) = [1g PF |G) (29) 


over an entire catalog {G} of observed lensing galaxies is the 
ultimate quantity of interest. 

The observations of a lensing galaxy will not be only the 
lensing data (say Gjens). There will typically be data on the 
stellar light (Say Ghight) and on the kinematics of the stars 
(say Gin), and possibly X-ray emission and other kinds of 
data as well. Let us consider Gjen; first. To assess whether 
some simulated galaxy g can fit some lensing data, we need 
to allow for different orientations of the galaxy and different 
source brightness distributions behind it. Orienting the sim- 
ulated galaxy by some z (which denotes the Euler angles) we 
compute the deflection-angle field a(@). Then we consider 


some unlensed brightness distribution (say uw) that will get 
lensed. The likelihood contribution has dependencies as 


P(Giens | 854, 4) = P(Gtens | @, u) P(ar| g, 0). (30) 


As the source distribution is not of special interest in the 
present context, one would like to marginalize it out 
PCGrens| 8.0) = f PCG 81.0) d G1) 
with a prior on u if desired. The orientation is a nuisance 
parameter and can be likewise marginalized out 
PGrnsls)= { PGiom 8.0. (32) 
The marginalization Eq. over orientations is somewhat 
computationally intensive but straightforward. The marginal- 
ization Eq. [31] over sources, on the other hand, is compli- 
cated. It is necessary to approximate the integral by fitting 
for sources that give non-negligible contributions to the in- 
tegral. A too-small sample of simulated galaxies is likely to 
yield no plausible g at all, or equivalently, that P(Gtens | g) 
might be vanishingly small for all g, no matter what the ori- 
entations and sources. However, (202 1a) find 
that plausible matches for observed lensing galaxies can be 
found in (at least) the EAGLE simulations, and it is possible 
to compute P(F | {G}) for some different feedback prescrip- 
tions. The next stage would be to extend to stellar population 
and kinematics, using 

[ Pui .0) PCG 0) PCGren 8.0 (33) 
to compute P(Gxin, Giignts Giens |g) and hence P(F | {G}) us- 
ing multiple data sets. 


7 Concluding remarks 


In this review article, we have provided a review of the appli- 
cations of galaxy-scale strong lensing in both astrophysics 
and cosmology. Inevitably, some special sub-topics within 
the field of galaxy-scale strong lensing have evolved into 
proper fields of research themselves, having acquired a method- 
ology and literature extensive enough to warrant a dedicated 
review. These are: detecting dark matter substructures and 
linking their properties to the dark matter particle, and mea- 
suring Ho and other cosmological parameters through time 
delays, examined in Vegetti et al. (in preparation) and Birrer 
et al. (in preparation) respectively. 

We started with a brief historical overview in Sect. 
Then, we have discussed both strong-lensing observables 
and complementary non-strong-lensing observables, and method- 
ologies to model and extract meaningful results from such 
data in Sect.|3| The most available and informative data for 
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Fig. 9 Comparison of measured logarithmic slopes of the total density profile at different redshifts between strong lensing measurements (grey 
points), dynamical measurements (red points), and simulation results (lines and shaded region). Figure taken from[{Wang et al.] (2020). Although 
strong lensing observations find steeping of yp, with time below z = 1, simulations find no trend or slightly decreasing trend in yp with decreasing 
redshift. However, modelling systematics cannot be ruled out as the source of this tension. 


galaxy-scale strong lenses come from imaging. We have re- 
viewed the most common modelling methods found in the 
literature to model such data and constrain galaxy proper- 
ties from lenses arcs, and in some cases with the inclusion 
of multiple images of a point-like source. Next, we have re- 
viewed the main scientific results from the literature pertain- 
ing to astrophysics of galaxies in Sect. [4jand to cosmologi- 
cal models in Sect. 5] We then discussed the currently open 
questions and provided future outlook in Sect. |6] 


The open questions presented in Sect. [6] provide excit- 
ing opportunities for the near future. Several large-area sky 
surveys — namely the Vera Rubin, Euclid, and Roman ob- 
servatories — will discover thousands of new galaxy-scale 
lensing systems (Oguri and Marshall 2070) [Collet] 2013) 
Various automated pipelines are being developed to tackle 
the computational aspect of modelling such large lens sam- 


ples (Chan et al.|/2015}|Nightingale et al.|!2018}|Shajib et al. 
2021) |Schmidt et al.| |2022). These treasure troves of data 


will provide the necessary statistical power to shed light on 
the open questions pertaining to galaxy evolution and cos- 
mology. 
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